34 research outputs found

    Subjective and objective quality evaluation of compressed medical video sequences

    Existing objective video quality metrics such as VQM from NTIA [1] and MOVIE [2] are known to perform well for assessing compression degradation in natural scene and broadcast television sequences, but their suitability for the quality evaluation of compressed medical video has not been studied extensively. In this work we assess the quality of compressed medical video sequences using objective metrics and a subjective evaluation study conducted with non-expert subjects. Test sequences consist of High Definition medical video of laparoscopic surgery. Four compression types (Motion JPG and three variants of H.264) at four bit-rates (5, 12, 20, and 45 Mbps) are studied and compared to original uncompressed sequences. One reduced-reference metric (VQM) and one full-reference metric (MOVIE) are studied. Subjective video evaluation consists of overall quality scores as well as difference scores between compressed and uncompressed sequences for similarity and five types of artifacts or attributes: blurring, blocking, noise, color fidelity, and motion artifacts. The results of the subjective and objective evaluations exhibit similar trends across the compression types and bit-rates, which suggests that these objective quality metrics may be valid reflections of subjective quality judgments made by non-expert observers on compressed medical video sequences. In future work we will expand the subjective quality evaluation to include expert laparoscopic surgeons as subjects.

    Effects of static and dynamic image noise and background luminance on letter contrast threshold

    We performed a pilot psychovisual experiment to determine the contrast threshold and slope of the psychometric function for a target embedded in two levels of static and dynamic external image noise. Sloan letters were presented in a local background surrounded by a global background, both varied over four luminance levels: 58.62, 155.97, 253.50, and 347.47 candela per square meter. Uncorrelated Gaussian noise with normalized standard deviation 0.019 and 0.087 was added to the stimuli. A noise-free stimulus was also tested. No systematic effect of global background luminance was found. The contrast threshold was approximately 1% in the noise-free stimulus and increased monotonically with rms noise contrast, following a power law relationship. Thresholds were higher in static noise. The model will be incorporated in a no-reference, task-based medical quality metric for x-ray sequences

    Influence of study design on digital pathology image quality evaluation : the need to define a clinical task

    Despite the current rapid advance in technologies for whole slide imaging, there is still no scientific consensus on the recommended methodology for image quality assessment of digital pathology slides. For medical images in general, it has been recommended to assess image quality in terms of doctors’ success rates in performing a specific clinical task while using the images (clinical image quality, cIQ). However, digital pathology is a new modality, and already identifying the appropriate task is difficult. In an alternative common approach, humans are asked to do a simpler task such as rating overall image quality (perceived image quality, pIQ), but that involves the risk of nonclinically relevant findings due to an unknown relationship between the pIQ and cIQ. In this study, we explored three different experimental protocols: (1) conducting a clinical task (detecting inclusion bodies), (2) rating image similarity and preference, and (3) rating the overall image quality. Additionally, within protocol 1, overall quality ratings were also collected (task-aware pIQ). The experiments were done by diagnostic veterinary pathologists in the context of evaluating the quality of hematoxylin and eosin-stained digital pathology slides of animal tissue samples under several common image alterations: additive noise, blurring, change in gamma, change in color saturation, and JPG compression. While the size of our experiments was small and prevents drawing strong conclusions, the results suggest the need to define a clinical task. Importantly, the pIQ data collected under protocols 2 and 3 did not always rank the image alterations the same as their cIQ from protocol 1, warning against using conventional pIQ to predict cIQ. At the same time, there was a correlation between the cIQ and task-aware pIQ ratings from protocol 1, suggesting that the clinical experiment context (set by specifying the clinical task) may affect human visual attention and bring focus to their criteria of image quality. Further research is needed to assess whether and for which purposes (e.g., preclinical testing) task-aware pIQ ratings could substitute cIQ for a given clinical task.

    Effects of common image manipulations on diagnostic performance in digital pathology: human study

    A very recent study (Ref. [1]) examined the effects of image manipulation and image degradation on the perceived attributes of image quality (IQ) of digital pathology slides. However, before any conclusions and recommendations can be formulated regarding specific image manipulations (and IQ attributes), it is necessary to investigate their effects on the diagnostic performance of clinicians when interpreting these images. In this study, 6 expert pathologists interpreted digital images of H&E stained animal pathology samples in a free-response (FROC) experiment. Participants marked locations suspicious for viral inclusions (inclusion bodies) and rated them using a continuous scale from 0 (low confidence) to 100% (high confidence). The images were the same as in Ref. [1]: crops of digital pathology slides of 3 different animal tissue samples, all 1200×750 pixels in size. Each participant viewed a total of 72 images: 12 nonmanipulated (reference) images (4 of each tissue type), and 60 manipulated images (5 for each reference image). The extent of artificial manipulations was adjusted relative to the reference images using the HDR-VDP metric [2] in the luminance domain: added Gaussian blur (sb=3), decreased gamma (-5%), added white Gaussian noise (sn=10), decreased color saturation (-5%), and JPG compression (libjpeg 50). The images were displayed on a 3MP medical color LCD in a controlled viewing environment. Preliminary analysis assessing the change in the number of positive markings in the reference and manipulated images indicates that blurring and changes in gamma, followed by changes in color saturation, could have an effect on diagnostic performance. This largely coincides with the findings from Ref. [1], where IQ ratings appeared to be most affected by changes in color and gamma parameters. Importantly, diagnostic performance appears to be content dependent; it differs across tissue types. Further data analysis (including JAFROC) is ongoing and will be reported in the conference talk.

    Evaluation of color differences in natural scene color images

    Since there is a wide range of applications requiring image color difference (CD) assessment (e.g. color quantization, color mapping), a number of CD measures for images have been proposed. However, the performance evaluation of such measures often suffers from the following major flaws: (1) test images contain primarily spatial (e.g. blur) rather than color-specific distortions (e.g. quantization noise), (2) there are too few test images (lack of variability in color content), and (3) test images are not publicly available (difficult to reproduce and compare). Accordingly, the performance of CD measures reported in the state of the art is ambiguous and therefore too inconclusive to guide any specific color-related application. In this work, we review a total of twenty-four state-of-the-art CD measures. Then, based on the findings of our review, we propose a novel method to compute CDs in natural scene color images. We have tested our measure as well as the state-of-the-art measures on three color-related distortions from a publicly available database (mean shift, change in color saturation and quantization noise). Our experimental results show that the correlation between the subjective scores and the proposed measure exceeds 85%, which is better than the other twenty-four CD measures tested in this work (for illustration, the best performing state-of-the-art CD measures achieve correlations with humans lower than 80%).

    Computing contrast ratio in medical images using local content information

    Rationale: Image quality assessment in medical applications is often based on quantifying the visibility between a structure of interest such as a vessel, termed foreground (F), and its surrounding anatomical background (B), i.e., the contrast ratio. A high-quality image is one that makes diagnostically relevant details distinguishable from the background. Therefore, the computation of contrast ratio is an important task in automatic medical image quality assessment. Methods: We estimate the contrast ratio by using Weber’s law in local image patches. A small image patch can contain a flat area, a textured area or an edge. Regions with edges are characterized by bimodal histograms representing B and F, and the local contrast ratio can be estimated using the ratio between mean intensity values of each mode of the histogram. B and F are identified by computing the mid-value between the modes using the ISODATA algorithm. This process is performed over the entire image with a sliding window, resulting in a contrast ratio per pixel. Results: We have tested our measure on two general purpose databases (TID2013 [1] and CSIQ [2]) to demonstrate that the proposed measure agrees with human preferences of quality. Since our measure is specifically designed for measuring contrast, only images exhibiting contrast changes are used. The difference between the maximum of the contrast ratios corresponding to the reference and processed images is used as a quality predictor. Human quality scores and our proposed measure are compared with the Pearson correlation coefficient. Our experimental results show that our method is able to accurately predict changes of perceived quality due to contrast decrements (Pearson correlations higher than 90%). Additionally, this method can detect changes in contrast level in interventional X-ray images acquired with varying dose [3]. For instance, the resulting contrast maps demonstrate reduced contrast ratios for vessel edges on X-ray images acquired at lower dose settings, i.e., lower distinguishability from the background, compared to higher dose acquisitions. Conclusions: We propose a measure to compute contrast ratio by using Weber’s law in local image patches. While the proposed contrast ratio is computationally simple, this approximation of local content has proven useful in measuring quality differences due to contrast decrements in images. In particular, changes in structures of interest due to low contrast ratio can be detected by using the contrast map, making our method potentially useful in X-ray imaging dose control. References: [1] Ponomarenko N. et al., “A New Color Image Database TID2013: Innovations and Results,” Proceedings of ACIVS, 402-413 (2013). [2] Larson E. and Chandler D., “Most apparent distortion: full-reference image quality assessment and the role of strategy,” Journal of Electronic Imaging, 19 (1), 2010. [3] Kumcu, A. et al., “Interventional x-ray image quality measure based on a psychovisual detectability model,” MIPS XVI, Ghent, Belgium, 2015.
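The Methods paragraph above can be read as a short procedure: threshold each local patch with ISODATA, take the ratio of the two class means, and slide the window over the image. The following is a minimal sketch of that reading, not the authors' implementation. The function names, the window size, the convergence tolerance, the edge padding, the choice of brighter-over-darker as the ratio, and the value 1.0 assigned to flat patches are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def isodata_threshold(values, tol=1e-3, max_iter=100):
    """Iterative ISODATA threshold: the mid-value between the two class means.

    Returns None when the values cannot be split into two classes (flat patch).
    """
    values = np.asarray(values, dtype=float)
    t = values.mean()
    for _ in range(max_iter):
        low = values[values <= t]
        high = values[values > t]
        if low.size == 0 or high.size == 0:
            return None
        new_t = 0.5 * (low.mean() + high.mean())
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t


def patch_contrast_ratio(patch, eps=1e-12):
    """Ratio of the brighter class mean to the darker class mean (assumed order)."""
    flat = np.asarray(patch, dtype=float).ravel()
    t = isodata_threshold(flat)
    if t is None:
        return 1.0  # assumption: a flat patch carries no contrast
    low_mean = flat[flat <= t].mean()
    high_mean = flat[flat > t].mean()
    return high_mean / max(low_mean, eps)


def contrast_map(image, window=7):
    """Per-pixel contrast ratio from a sliding window (odd window size assumed)."""
    image = np.asarray(image, dtype=float)
    pad = window // 2
    padded = np.pad(image, pad, mode="edge")  # assumption: replicate borders
    out = np.ones_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = patch_contrast_ratio(padded[r:r + window, c:c + window])
    return out


def quality_predictor(reference, processed, window=7):
    """Difference between the maximum contrast ratios of reference and processed images."""
    return contrast_map(reference, window).max() - contrast_map(processed, window).max()
```

Calling `contrast_map(image, window=3)` on a synthetic image with one vertical step edge yields ratios above 1 only for windows that straddle the edge. `quality_predictor` returns a positive value when the processed image has a weaker maximum contrast than the reference.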

    Multi-modal measurement of cortical thickness in brain MRI for Focal Cortical Dysplasia detection

    In this work we aim to improve the detection of Focal Cortical Dysplasia (FCD) on MRI images using a multimodal approach. We propose to estimate the thickness of the cortex jointly using partial volume maps of T1-weighted MPRAGE and T2-weighted FLAIR images by fitting spheres into the gray matter of the brain such that the amount of probability-weighted gray matter contained in each sphere is maximized. Results on nine patients show that the FCD lesions could be detected in all patients using the multimodal approach, compared to only 7 patients using T1 alone and 4 patients using FreeSurfer.